Relative effectiveness of medications for opioid-related disorders: A systematic review and network meta-analysis of randomized controlled trials

Introduction. Several pharmacotherapeutic interventions are available for maintenance treatment for opioid-related disorders. However, previous meta-analyses have been limited to pairwise comparisons of these interventions, and their efficacy relative to all others remains unclear. Our objective was to unify findings from different healthcare practices and generate evidence to strengthen clinical treatment protocols for the most widely prescribed medications for opioid-use disorders.
Methods. We searched Medline, EMBASE, PsycINFO, CENTRAL, and ClinicalTrials.gov for all relevant randomized controlled trials (RCT) from database inception to February 12, 2022. Primary outcome was treatment retention, and secondary outcome was opioid use measured by urinalysis. We calculated risk ratios (RR) and 95% credible interval (CrI) using Bayesian network meta-analysis (NMA) for available evidence. We assessed the credibility of the NMA using the Confidence in Network Meta-Analysis tool.
Results. Seventy-nine RCTs met the inclusion criteria. Due to heterogeneity in measuring opioid use and reporting format between studies, we conducted NMA only for treatment retention. Methadone was the highest ranked intervention (Surface Under the Cumulative Ranking [SUCRA] = 0.901) in the network with control being the lowest (SUCRA = 0.000). Methadone was superior to buprenorphine for treatment retention (RR = 1.22; 95% CrI = 1.06–1.40) and buprenorphine superior to naltrexone (RR = 1.39; 95% CrI = 1.10–1.80). However, due to a limited number of high-quality trials, confidence in the network estimates of other treatment pairs involving naltrexone and slow-release oral morphine (SROM) remains low.
Conclusion. All treatments had higher retention than the non-pharmacotherapeutic control group. However, additional high-quality RCTs are needed to estimate more accurately the extent of efficacy of naltrexone and SROM relative to other medications. For pharmacotherapies with established efficacy profiles, assessment of their long-term comparative effectiveness may be warranted.
Trial Registration. This systematic review has been registered with PROSPERO (https://www.crd.york.ac.uk/prospero) (identifier CRD42021256212).


Introduction
Opioid use disorder (OUD), including opioid dependence and addiction, is a pattern of problematic opioid use commonly characterized by tolerance and withdrawal symptoms. Adverse health outcomes associated with OUD include overdose, infectious diseases (e.g., AIDS; hepatitis C; and skin, soft tissue, and vascular infections), suicide, and death in severe cases [1][2][3][4][5]. A recent study on OUD burden estimated that, globally, 40.5 million people suffered from opioid dependence and 109,500 died from opioid overdose (including both accidental and intentional cases) in 2017 [6]. In response to the severity of OUD disease burden, several treatment pathways have emerged, with opioid maintenance therapy (OMT) as the most effective strategy.
OMT involves the use of opioid substitutes (e.g., buprenorphine, methadone, or naltrexone) to treat and manage addiction to opioids such as heroin, fentanyl, hydromorphone, and oxycodone [7]. Compared to other treatment pathways (e.g., detoxification, residential services, and behavioural interventions) or no treatment, OMT has lower risk of overdose hospitalizations [8], mortality [9], and frequency of injection drug use and needle sharing [10][11][12]. With mounting evidence of its effectiveness, OMT has undoubtedly become the gold standard for treating opioid addiction and dependence, and public health stakeholders around the world have sought to expand access to medications for OUD in order to reduce the individual- and population-level burdens of OUD [13][14][15].
Researchers from the United States, Canada, and European nations (e.g., the UK, France, Germany, Spain, and Finland) have adopted guidelines or reached a consensus to prescribe buprenorphine and methadone as first-line OMT treatments against OUD [15][16][17][18]. At the same time, there has been growing interest in understanding the efficacy of other medications for OUD, such as naltrexone [19,20] and slow-release oral morphine (SROM) [21][22][23]. For example, recent clinical practice guidelines from Canada and the United States have outlined expectations of professional conduct in relation to prescribing these medications in addition to buprenorphine and methadone [17,18]. These OMT guidelines reflect advances in clinical practice, and the availability of a wider range of therapeutic options for OMT medications may reduce OUD burden [14]. However, no review to date has compared treatment efficacy between several different OMT medications. Despite the availability of multiple medications, previous meta-analyses have been limited to pairwise comparisons of buprenorphine vs. methadone [24], SROM vs. methadone [21], or OMT medications vs. no maintenance treatment [20,25]. In the absence of randomized controlled trials (RCT) comparing other combinations of treatment pairs, especially those involving naltrexone or SROM, their treatment efficacy relative to that of buprenorphine or methadone remains poorly understood. We undertook a systematic review with a network meta-analysis to establish which of buprenorphine, methadone, naltrexone, and SROM have superior efficacy profiles. This study unifies findings from different healthcare practices around the world and generates evidence to strengthen clinical treatment protocols for the most widely prescribed OMT medications.

Materials and methods
We conducted this systematic review in accordance with a pre-specified protocol (PROSPERO number: CRD42021256212), and we reported our findings following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses for Network Meta-Analyses (PRISMA-NMA) (S1 Table) [26].

Search strategy
We conducted a systematic literature search of studies comparing medications for OUD among people with opioid-related disorders in MEDLINE, EMBASE, PsycINFO, and Cochrane CENTRAL databases. We also scanned the bibliographies of the included articles and searched through the first 10 pages of Google Scholar (search terms: 'randomized controlled trials opioid use disorder') to identify additional references in the grey literature. We tailored our search strategy to each database, and search terms and keywords included those related to buprenorphine, methadone, naltrexone, slow-release oral morphine, and opioid-related disorders (see S2 Table for Search Strategy). There were no restrictions on the language of publication. The initial search was conducted on June 7, 2021, and it was updated on February 12, 2022.

Inclusion and exclusion criteria
Population. This study included adult patients with problematic opioid or heroin use who were receiving pharmacotherapy, including buprenorphine, methadone, naltrexone, and SROM. Problematic use included OUD, opioid dependence, and opioid addiction as well as conditions specific to the use of heroin. These conditions were ascertained by the Diagnostic and Statistical Manual of Mental Disorders (DSM-3, -4, or -5) or the International Classification of Diseases (ICD) criteria. We excluded pregnant women in order to mitigate heterogeneity in the study population that could arise from the scientific and ethical complexities involving this patient group.
Interventions and comparators. The main intervention for this study was pharmacotherapy for OUD including buprenorphine, methadone, naltrexone, and slow-release oral morphine. We included studies that compared any of the described intervention medications with each other or with a control group (e.g., placebo, standard of care, or no treatment). For inclusion, these interventions needed to be administered to treat OUD. Studies with one of the four intervention medications in one arm but another medication that is not one of the four intervention medications or controls (e.g., heroin or clonidine) were excluded. This was because they were not described in the latest best practice guidelines and consensuses for opioid maintenance treatment, and therefore did not reflect the most up-to-date clinical practice around the world (e.g., North America and Europe). In addition, we excluded from our analyses RCTs that compared different treatment regimens of the same medication, including studies where only the dosage, setting, or the route of administration (e.g., sublingual vs. injectable) differed between treatment groups.
For the multi-arm trials, we took the following analytic approach to evidence synthesis. First, in studies with three or more treatment arms, we extracted all intervention information but included only the eligible arms for evidence synthesis, following the approach taken by Rice et al. 2020 [27]. Second, in studies with two or more treatment arms with the same intervention (e.g., two arms with buprenorphine and two arms with methadone), we categorized each arm as 'low dose', 'moderate dose', or 'high dose' and treated two intervention arms with different medications that have the same categorization as a separate study. For example, if the four intervention arms were low dose buprenorphine, low dose methadone, moderate dose buprenorphine, and moderate dose methadone, the two low dose arms were included in the analysis as one study and the two moderate dose arms as another study. Third, in other trials with a small sample size in each arm, we combined multiple treatment arms with the same intervention into a larger single arm, if applicable, as had been done in earlier meta-analyses [20,24,25].
Outcomes. The primary outcome of interest was treatment retention at the end of the study. We defined treatment retention for each intervention or control arm as the number of participants who completed or remained in the study (did not withdraw) divided by the total number of participants who were randomized to a specific treatment in the beginning of the study. The secondary outcome of interest was opioid use based on urinalysis. Opioid use could be reported as either abstinence from opioids or illicit opioid use (e.g., heroin), and the definition of abstinence could vary by study duration or the number of times of opioid use. Therefore, for each intervention or control arm, we defined opioid use in two different ways: (1) Percentage of urine samples that were positive for opiates at the end of the study, and (2) Number of patients who had at least one urine sample that was positive for opiates at the end of the study out of all those randomized to each arm in the beginning of the study.
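The retention definition above can be illustrated with a minimal sketch. The function names and the counts are hypothetical and are not taken from any included trial; the crude risk ratio shown here is a simple two-arm calculation, not the Bayesian network estimate used in the analysis.

```python
def retention_proportion(retained: int, randomized: int) -> float:
    """Participants retained at study end divided by participants randomized."""
    if randomized <= 0:
        raise ValueError("randomized must be positive")
    if not 0 <= retained <= randomized:
        raise ValueError("retained must lie between 0 and randomized")
    return retained / randomized


def crude_risk_ratio(retained_a: int, randomized_a: int,
                     retained_b: int, randomized_b: int) -> float:
    """Crude risk ratio of retention for arm A relative to arm B."""
    p_b = retention_proportion(retained_b, randomized_b)
    if p_b == 0:
        raise ValueError("reference arm has zero retention; RR is undefined")
    return retention_proportion(retained_a, randomized_a) / p_b


# Hypothetical two-arm trial (made-up counts for illustration only)
rr = crude_risk_ratio(retained_a=60, randomized_a=100,
                      retained_b=40, randomized_b=100)
print(round(rr, 2))
```

Because the denominator is everyone randomized, participants who withdraw count against retention in their assigned arm.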
Study design. We included RCTs that compared any of the four medications for OUD, namely buprenorphine, methadone, naltrexone, and SROM against each other for any treatment regimen (dose, frequency, timing, duration, and route of administration), placebo, standard of care, or no treatment. We restricted our study design to RCTs because they were best suited to address the relative efficacy of the pharmacotherapies on retention and opioid use, while eliminating confounding present in other study designs. Consequently, observational studies, case series, and case reports were excluded from our analysis. To identify additional eligible studies, we inspected earlier systematic reviews and meta-analyses, but these systematic reviews and meta-analyses themselves were not included in the qualitative and quantitative evidence syntheses.

Screening
Two independent reviewers (JL and IF) screened titles and abstracts, and then conducted a full-text review of all articles retrieved from the databases for eligibility (study selection) based on specified inclusion/exclusion criteria. We conducted title and abstract screening using a liberal accelerated approach, in which only one reviewer was needed to include a citation, while two reviewers were needed to exclude a citation [28]. At this stage, articles that were judged to be potentially relevant underwent a full-text review, which two reviewers performed independently and in duplicate. Disagreements between the two reviewers were resolved through consensus or adjudication by a third independent reviewer (DP), if necessary.
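As an illustration only, the liberal accelerated rule for title and abstract screening can be written as a small decision function. The function name and the returned labels are assumptions made for this sketch.

```python
from typing import Optional


def screening_decision(vote_1: bool, vote_2: Optional[bool] = None) -> str:
    """Title/abstract decision under a liberal accelerated rule.

    A vote of True means "include". A single include vote is enough to
    advance a citation; exclusion requires two exclude votes.
    """
    votes = [v for v in (vote_1, vote_2) if v is not None]
    if any(votes):
        return "full-text review"
    if len(votes) == 2:
        return "excluded"
    return "needs second reviewer"


print(screening_decision(True))          # one include vote is sufficient
print(screening_decision(False))         # a single exclude vote is not final
print(screening_decision(False, False))  # two exclude votes are required
```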

Data extraction
Two reviewers (JL and IF) conducted data extraction using a pre-piloted, standardized data extraction form (S3 Table). Data were independently extracted by a single reviewer (JL) and were verified by a second reviewer (IF). Disagreements were resolved through consensus or adjudication by a third independent reviewer (DP), if necessary. Extracted information included publication traits (year, setting/country, and funding source), study characteristics (trial design, trial duration, intervention type, comparator type, treatment regimen for each arm, number of participants randomized [total and for each arm], inclusion/exclusion criteria), basic participant characteristics (age and sex), and outcome data (number of participants retained and results from urinalysis for detecting opiates, including morphine and heroin). If multiple studies reported on the same cohort, we included only the most recent article with the most up-to-date study information.

Risk of bias assessment
Two reviewers (JL and IF) independently assessed the risk of bias for all included studies. We used Version 2 of the Cochrane Risk of Bias (RoB 2) tool to examine potential biases in five domains [29]. These include biases in randomization process, deviations from intended intervention, missing outcome data, measurement of the outcome, and selection of the reported results. Based on the assessment of each domain, we determined the overall risk of bias as one of 'low risk of bias', 'some concerns', and 'high risk of bias', where the overall risk was determined by the highest risk assigned in any individual domain.
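The overall rating rule described above (the highest risk assigned in any domain) can be sketched as follows. This is a simplified illustration of the rule as stated in this section, not a reimplementation of the full RoB 2 algorithm; the dictionary keys and the example ratings are hypothetical.

```python
# Ordered from lowest to highest risk
LEVELS = ("low risk of bias", "some concerns", "high risk of bias")

DOMAINS = (
    "randomization process",
    "deviations from intended intervention",
    "missing outcome data",
    "measurement of the outcome",
    "selection of the reported results",
)


def overall_risk_of_bias(domain_ratings: dict) -> str:
    """Return the highest risk level assigned across the five domains."""
    missing = [d for d in DOMAINS if d not in domain_ratings]
    if missing:
        raise ValueError(f"missing domain ratings: {missing}")
    # LEVELS.index raises ValueError for an unrecognized rating
    return max((domain_ratings[d] for d in DOMAINS), key=LEVELS.index)


# Hypothetical trial: low risk everywhere except missing outcome data
example = {d: "low risk of bias" for d in DOMAINS}
example["missing outcome data"] = "some concerns"
print(overall_risk_of_bias(example))
```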

Review of network geometry
We constructed network graphs to visualize the overall structure of the comparisons in our network. The nodes of the network graph represent the competing treatments, and an edge connects them if at least one RCT compared these corresponding treatments. We examined which treatments were compared directly (head-to-head comparisons) or indirectly (through one or more common comparators) and the amount of evidence generated from each comparison.
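A minimal sketch of this network structure is given below, assuming each trial contributes one pair of compared treatments. The trial list is made up for illustration and does not reproduce the actual evidence base; the function names are hypothetical.

```python
from collections import Counter, deque


def build_network(comparisons):
    """Return (edge_counts, adjacency) from (treatment_a, treatment_b) pairs."""
    edge_counts = Counter()
    adjacency = {}
    for a, b in comparisons:
        if a == b:
            raise ValueError("a comparison needs two different treatments")
        edge_counts[frozenset((a, b))] += 1  # number of trials on this edge
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return edge_counts, adjacency


def comparison_type(adjacency, a, b):
    """Classify a treatment pair as 'direct', 'indirect', or 'disconnected'."""
    if b in adjacency.get(a, set()):
        return "direct"
    seen, queue = {a}, deque([a])
    while queue:  # breadth-first search through common comparators
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt == b:
                return "indirect"
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return "disconnected"


# Hypothetical trials (illustrative only)
trials = [
    ("methadone", "buprenorphine"),
    ("methadone", "buprenorphine"),
    ("buprenorphine", "control"),
    ("naltrexone", "control"),
    ("SROM", "methadone"),
]
edges, adj = build_network(trials)
print(edges[frozenset(("methadone", "buprenorphine"))])
print(comparison_type(adj, "SROM", "control"))
```

In this toy network, a treatment compared head-to-head with only one other treatment reaches the remaining nodes solely through indirect paths.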

Strategies for evidence synthesis
We conducted a network meta-analysis of available direct and indirect evidence using a Bayesian framework to account for the correlation between treatment effects by different comparisons in multi-arm trials [30]. We chose to conduct network meta-analysis using the Bayesian framework, as this paradigm allows for probability statements [31], such as "There is X% probability that treatment A is the most efficacious out of all treatments".
For treatment retention and opioid use, the treatment effect measure was the risk ratio (RR). To estimate the posterior distribution of the treatment effects, we conducted a Markov Chain Monte Carlo (MCMC) simulation with 100,000 iterations total and 5,000 burn-in iterations. To derive the posterior distribution, we ran both random-effects and fixed-effects models with an uninformative prior distribution of treatment effects. Next, we determined model fit using the deviance information criterion (DIC), where smaller DIC values correspond to better fit [32]. We then assessed the convergence of the MCMC simulations using the Gelman-Rubin-Brooks plot and the potential scale reduction factor (PSRF) [33,34], where we determined that convergence was reached if the PSRF < 1.05.
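For readers unfamiliar with the PSRF, the sketch below computes a basic Gelman-Rubin statistic for a single parameter from several chains and applies the 1.05 threshold stated above. It is a simplified textbook version (without the degrees-of-freedom correction some software applies) and is not the routine used in the analysis; the chains are deterministic toy sequences.

```python
from math import sqrt
from statistics import mean, variance


def psrf(chains):
    """Basic potential scale reduction factor for one parameter.

    chains: list of equal-length lists of posterior draws (one per chain).
    """
    m, n = len(chains), len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])  # W: mean within-chain variance
    between = n * variance(chain_means)           # B: between-chain variance
    if within == 0:
        raise ValueError("within-chain variance is zero")
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return sqrt(pooled / within)


# Toy chains: two that overlap, and two stuck in different regions
mixed = [[0, 1] * 50, [1, 0] * 50]
separated = [[0, 1] * 50, [10, 11] * 50]
print(psrf(mixed) < 1.05)      # overlapping chains pass the threshold
print(psrf(separated) < 1.05)  # separated chains do not
```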
We analyzed the outcomes by constructing the model for binary endpoints for network meta-analysis, and we calculated the RR with a 95% credible interval (CrI). To rank the preference of treatment options in the study, we plotted a rankogram and calculated the Surface Under the Cumulative Ranking (SUCRA) score [35]. The SUCRA score ranges from 0 to 1, where values closer to 1 indicate the more preferred treatments. Finally, we used the node split method to evaluate consistency of our network model, where consistency refers to the concordance of results between direct and indirect estimates within the network meta-analytic model. All analyses were conducted using R Studio version 4.0.5.
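The SUCRA score can be computed from a treatment's rank probabilities as the average of its cumulative ranking probabilities over the first a − 1 ranks, where a is the number of treatments. The sketch below assumes that standard definition; the function name and the probability vectors are hypothetical.

```python
def sucra(rank_probs):
    """SUCRA from rank probabilities.

    rank_probs[k] is the probability that the treatment has rank k + 1,
    where rank 1 is the best. The probabilities must sum to 1.
    """
    a = len(rank_probs)
    if a < 2:
        raise ValueError("need at least two treatments")
    if abs(sum(rank_probs) - 1.0) > 1e-8:
        raise ValueError("rank probabilities must sum to 1")
    cumulative, total = 0.0, 0.0
    for p in rank_probs[:-1]:  # cumulative probabilities for ranks 1..a-1
        cumulative += p
        total += cumulative
    return total / (a - 1)


# Hypothetical five-treatment network
print(sucra([1, 0, 0, 0, 0]))  # always ranked best
print(sucra([0, 0, 0, 0, 1]))  # always ranked worst
```

A treatment that is certain to rank last has a SUCRA of 0, and one that is certain to rank first has a SUCRA of 1.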

Sensitivity analysis
To assess whether study characteristics are associated with effect size differences, we conducted univariate network meta-regression analyses by the overall risk of bias (Low Risk vs. Some Concerns or High Risk from RoB 2) and by the year of study publication (Before 2010 vs. On or After 2010). We conducted an MCMC simulation with the same conditions as above and compared model fit with the main analysis model using the DIC. We also examined the credibility of the network meta-analysis results using the Confidence in Network Meta-Analysis (CINeMA) tool to improve transparency and limit subjectivity in the evidence synthesis process [36]. For each pairwise comparison, both direct and indirect, we specifically assessed six domains: (1) within-study bias, (2) reporting bias, (3) indirectness, (4) imprecision, (5) heterogeneity, and (6) incoherence. We evaluated each domain and rated it as 'No concerns', 'Some concerns', or 'Major concerns' based on a set of criteria described below. For all possible pairwise comparisons across the six domains, we took the average of the level of concerns to assign a rating. For additional details, please refer to S1 Text.

Search results
The database searches identified 13,262 publications (Fig 1). We removed 4,915 duplicates and an additional 7,915 studies upon review of the title and abstract. These studies were excluded because they were irrelevant to the research question or had observational study designs. Two reviewers independently examined the remaining 432 articles and excluded 360 of them (see S4 Table for the list of articles for full-text review and reasons for exclusion). In addition to the 72 RCTs that met our inclusion criteria, we identified 7 additional studies through grey literature search and review of earlier meta-analyses [37-43]. In total, 79 studies were included for the qualitative and quantitative evidence syntheses. Table 1 summarizes the key demographic and study characteristics of the included RCTs, whose duration ranged from 2 weeks to 1 year. Additional details of each study can be found in the S5

have higher risk of bias due to deviations from the intended interventions because a large number of patients failed to remain and adhere to the originally assigned treatment shortly after randomization [44,114]. Relatedly, four studies that examined opioid use as their main endpoint were at higher risk of bias due to missing outcome data because they employed per-protocol analyses that excluded participants who were lost to follow-up before the end of the study period [43,69,103,114]. Since both treatment retention and opioid use were measured objectively, the risk of bias in the measurement of outcome was judged 'Low Risk' for all studies, and the risk of bias from selective reporting remains low. However, due to the lack of available protocol in the grant (e.g., NIH REPORTER) or trial registry (e.g., ClinicalTrials.gov), the risk of reporting bias was judged 'Some
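The study-flow counts reported at the start of this section can be checked with simple arithmetic. The sketch below uses only the numbers given in the text; the function and key names are hypothetical.

```python
def study_flow(identified, duplicates, excluded_title_abstract,
               excluded_full_text, added_other_sources):
    """Reproduce the study-selection counts from the reported numbers."""
    full_text = identified - duplicates - excluded_title_abstract
    eligible = full_text - excluded_full_text
    return {
        "full_text_reviewed": full_text,
        "eligible_from_databases": eligible,
        "included_total": eligible + added_other_sources,
    }


flow = study_flow(13262, 4915, 7915, 360, 7)
print(flow)
# {'full_text_reviewed': 432, 'eligible_from_databases': 72, 'included_total': 79}
```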

Evidence synthesis
Due to differences in approach to measurement of opioid use (e.g., differences in defining abstinence) even between studies using urinalysis to ascertain it, we judged that the network meta-analysis (NMA) would not produce reliable estimates for the secondary outcome. Therefore, we only conducted NMA for treatment retention (primary outcome).

and model fit comparison). The MCMC simulation with the random effects model resulted in the PSRF of 1.0037, suggesting convergence of the MCMC algorithm (i.e., the simulation resulted in an accurate estimate of our parameters; see S1 and S2 Figs for detailed results of the simulation).
All pharmacotherapy options were more efficacious with respect to treatment retention than the control group. Compared to the control group, the likelihood of retention was 2.62 (95% CrI = 2.09-3.33), 2.52 (95% CrI = 1.62-3.94), 2.15 (95% CrI = 1.76-2.69), and 1.54 (95% CrI = 1.26-1.90) times higher for methadone, SROM, buprenorphine, and naltrexone, respectively (Fig 3). Further, the average percentage of treatment retention across all studies was 77.6% for SROM, 64.1% for methadone, 54.3% for buprenorphine, 41.0% for naltrexone, and

Table 3). Methadone had a higher likelihood of retention than buprenorphine (RR = 1.22; 95% CrI = 1.06-1.40) and naltrexone (RR = 1.69; 95% CrI = 1.30-2.24) but remained statistically equivalent to SROM (RR = 1.04; 95% CrI = 0.71-1.52). Similarly, buprenorphine (RR = 1.39; 95% CrI = 1.10-1.80) and SROM (RR = 1.63; 95% CrI = 1.01-2.63) both had a higher likelihood of retention than naltrexone, but the two medications remained equivalent (RR = 0.86; 95% CrI = 0.57-1.28 with SROM as the reference group).

Fig 4 illustrates forest plots of direct and indirect estimates of likelihood of treatment retention from the node split method, which was used to evaluate consistency of our network model. Results from six pairwise comparisons involving buprenorphine, methadone, naltrexone, and control group were presented. Pairwise comparisons involving SROM were not included in this node split approach because methadone was its only comparator, and SROM was disconnected from all other interventions including the control group in the network graph. For all pairwise comparisons except naltrexone versus methadone, the estimates from direct and indirect comparisons were statistically equivalent (p-value > 0.05). However, for the naltrexone versus methadone pair, there was some evidence of inconsistency (p-value = 0.027), as the magnitude of the RR was larger for the direct estimate than for the indirect estimate.
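As a rough plausibility check on the network estimates reported above, the RR between two medications should be close to the ratio of their RRs against control when the network is consistent. The sketch below applies this to the reported point estimates only. The 0.02 tolerance is an arbitrary assumption, and exact agreement is not expected because each reported value is a rounded posterior summary.

```python
from math import exp, log

# Reported RRs of retention versus the control group
rr_vs_control = {
    "methadone": 2.62,
    "SROM": 2.52,
    "buprenorphine": 2.15,
    "naltrexone": 1.54,
}

# Reported RRs between medications: (numerator, reference) -> RR
reported = {
    ("methadone", "buprenorphine"): 1.22,
    ("methadone", "naltrexone"): 1.69,
    ("methadone", "SROM"): 1.04,
    ("buprenorphine", "naltrexone"): 1.39,
    ("SROM", "naltrexone"): 1.63,
    ("buprenorphine", "SROM"): 0.86,
}


def implied_rr(a, b):
    """RR of a versus b implied by their RRs against control (log scale)."""
    return exp(log(rr_vs_control[a]) - log(rr_vs_control[b]))


for (a, b), value in reported.items():
    print(f"{a} vs {b}: reported {value:.2f}, implied {implied_rr(a, b):.2f}")
```

This check concerns only the internal coherence of the reported summaries; it says nothing about agreement between direct and indirect evidence, which the node split method addresses.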
As for sensitivity analysis, model fit from univariate network meta-regression analysis adjusting for risk of bias and publication year remained similar to that from the unadjusted analysis (DIC: Unadjusted = 294.58; Risk of bias = 293.46; Publication year = 294.66). Thus, it remains unlikely that controlling for these study characteristics and methodological differences between studies influenced the magnitude of the effect sizes in our network. Using the CINeMA tool (Table 4, S5 and S6 Figs), we judged with 'High' level of confidence the estimates of the likelihood of treatment retention comparing buprenorphine vs. control and methadone vs. control. For the buprenorphine vs. methadone and buprenorphine vs. naltrexone comparisons, we assigned a 'Moderate' confidence rating due to the presence of heterogeneity. For the methadone vs. naltrexone comparison, we assigned a confidence rating of 'Low' due to having a small number of studies as well as concerns related to the 'Heterogeneity' and 'Incoherence' domains for the estimates derived from the network. Finally, all pairwise comparisons involving SROM received the confidence rating of 'Very Low' due to high degree of incoherence that resulted from insufficient evidence in the network.

Synthesis of secondary outcomes. Due to both small number of studies reporting secondary outcomes and inconsistent reporting formats, we were unable to meta-analyze studies reporting these outcomes. We present below details of the post-hoc analysis of the secondary outcomes.
Thirty  61,62,69,74,87,98,103,105,114] assessed opioid use as the number of patients who had at least one opioid positive urine sample at the end of the study out of all those randomized to each arm in the beginning of the study; and three studies [61,62,98] applied both definitions. In addition, there were heterogeneities in reporting opioid use across studies. Among the studies that reported percentage of opioid positive urine samples, nine reported the total number of tests and number of positive urine samples [50,59,61,62,76,79,82,88,94], eight reported percentages with a 95% CI or standard error [47,50,54,63,76,82,97,115], and six reported percentages with a standard deviation [60,62,83,95,101,110]. However, 13 studies only reported the percentage value without other parameters to assess variability of the sample [51,57,65,71,72,78,96,98,102,107,108,111,112].

Discussion
All pharmacotherapeutic strategies were associated with a higher likelihood of treatment retention compared to the control arm, which consisted of placebo, standard of care, or no treatment. Based on SUCRA rankings, methadone appeared to be the most effective pharmacotherapy for treatment retention, followed by SROM, buprenorphine, and naltrexone. Buprenorphine was superior to naltrexone, and naltrexone was superior to non-pharmacotherapeutic interventions in treatment retention. SROM was compared only with methadone, and all other pairwise comparisons involving SROM were based on indirect evidence. The lack of available evidence on direct comparison also raised major concerns with respect to the 'Incoherence' domain in the CINeMA, which in turn lowered confidence in the estimates related to SROM generated by the NMA. To our knowledge, this is the first network meta-analysis of the medications for opioid use disorder to examine their relative efficacy against each other. The findings from this study remain consistent with those from earlier meta-analyses by Mattick et al. (2014) and Mattick et al. (2009) [24,25], which concluded that buprenorphine and methadone were associated with higher treatment retention than non-pharmacotherapeutic controls and that methadone resulted in superior treatment retention to buprenorphine. At the same time, our study provides an important update to an earlier Cochrane review by Minozzi et al. (2011) [20]. This review concluded that naltrexone did not result in higher retention than placebo or non-pharmacotherapeutic agents, and that naltrexone was non-inferior to buprenorphine in treatment retention from only one study [100]. The authors' conclusions were based on a limited number of studies that were published before June 2010. With additional evidence, we found that naltrexone is an effective treatment strategy for retention compared to non-pharmacotherapeutic interventions, which contrasts with the findings of Minozzi et al. (2011).
Further, our NMA generated preliminary evidence on the superiority of buprenorphine to naltrexone in treatment retention, supported by both the risk ratio estimates and the average proportion of retention calculated for each medication. However, additional RCTs will need to be conducted to better ascertain the efficacy of naltrexone in relation to methadone or SROM, since the network estimates were largely based on indirect comparisons.
Evidence on naltrexone from this network meta-analysis may also have important clinical implications in relation to the current guidelines that recommend buprenorphine and methadone as first-line treatments [15][16][17]. A 2015 guideline by the American Society of Addiction Medicine (ASAM) described buprenorphine, methadone, and naltrexone as more effective than non-pharmacotherapeutic strategies in treating OUD but concluded that their relative advantages over each other remain unknown [18]. Our NMA partially addresses this important gap raised in the ASAM guideline by generating evidence that demonstrates greater likelihood of treatment retention associated with buprenorphine than with naltrexone. In other words, should the clinicians prioritize retention while treating patients with OUD, buprenorphine may be the preferred treatment to naltrexone. On the other hand, although methadone was ranked higher than both buprenorphine and naltrexone using the SUCRA scores, the network estimates from the methadone-naltrexone pair would need to be interpreted more cautiously. Only one trial conducted a head-to-head comparison of these two medications, which subsequently raised some concerns around heterogeneity and incoherence of the NMA results. While methadone may have higher likelihood of treatment retention than naltrexone, the extent to which this is true may require further validation through additional RCTs.
Evidence on SROM from this network meta-analysis reveals the potential methodological issues surrounding earlier trials that contributed to the current clinical practice guidelines. In Canada, for example, clinical practice guidelines have stated that SROM could be a safe and effective alternative to treating OUD based on a small number of studies comparing SROM to methadone [17,116]. Two review studies reached similar conclusions by stating that treatment retention may be similar between SROM and methadone, but the authors of these studies also acknowledged that the methodological quality of the included studies may be low to moderate [21,22]. Our NMA results were consistent with these reviews and guidelines, as the number of high-quality RCTs comparing SROM to other medications for OUD was limited. However, through extensive examination of the risk of bias and credibility of the NMA estimates, we also observed that most of the trials comparing SROM to methadone had been industry-sponsored by the manufacturers of SROM, namely the Mundipharma Medical Company [47,56,84], which had not been detected in earlier reviews. Relatedly, the recommendations in the above guidelines may be based on studies with a potential for reporting bias, further lowering the confidence in the estimates from pairwise comparisons in the NMA involving SROM. Therefore, additional RCTs comparing SROM with other pharmacotherapeutic interventions are needed to better determine comparative effectiveness and facilitate treatment decisions for opioid maintenance treatments.
Our study has several strengths. First, we conducted a rigorous and extensive literature search across five databases and grey literature. Second, we included only the trials whose participants had opioid-related disorders and whose outcomes were measured objectively, which include treatment retention and opioid use measured by urinalysis. Although this may have reduced the number of studies in the network, we applied such eligibility criteria to ensure comparable study populations across studies. Third, the network meta-analysis included all pharmacotherapeutic strategies in the model. This allowed us to derive risk ratio estimates from both direct and indirect comparisons of all treatments in the network, which was not possible in the previous pairwise meta-analyses. Fourth, we examined the credibility of our network meta-analytic estimates using the CINeMA tool, a state-of-the-art platform that enabled a thorough assessment of the NMA results across six methodological domains.
Our study also has a few limitations. First, as with all network meta-analyses, randomization holds within each study but not between different studies [117], which could lead to heterogeneity in the study population across studies and subsequently yield biased results. However, we followed a pre-specified protocol with pre-determined eligibility criteria, which allowed us to mitigate between-study population differences. Second, due to heterogeneities between studies in outcome measurement and reporting formats, we were unable to meta-analyze trials on secondary outcomes. Although we conducted NMA only on the treatment retention outcome, findings on this endpoint remain clinically relevant because improvements in retention can reduce illicit opioid use as well as psychiatric, medical, and legal setbacks, while enhancing quality of life for people with OUD [118,119]. Third, evidence on comparisons involving naltrexone or SROM remained sparse, and confidence in the network meta-analytic estimates was low. This hindered us from drawing substantive conclusions on their efficacy and advantages in relation to the more widely prescribed buprenorphine and methadone. Fourth, about 78% of all participants across the studies in the network were male, which could compromise the generalizability of the study findings. We note, however, that the large majority of those who are treated for opioid use disorders are also male patients [13][120][121][122]. Therefore, the generalizability of our study findings likely still holds true. Finally, long-term comparative effectiveness of the included pharmacotherapies remains uncertain because only RCTs were included in the network. However, the findings from our NMA serve as evidence that could pave the way for future observational studies, which are accepted as the best evidence for clinical and policy decision-making surrounding these medications [123].

Conclusion
For treating opioid-related disorders, maintenance treatment through buprenorphine, methadone, naltrexone, and SROM is more effective than non-pharmacotherapeutic interventions. Among the medications included in the network, methadone appears to be the most efficacious pharmacotherapy for treatment retention. Due to limitations in reporting and heterogeneity in outcome measurement formats, the relative efficacy of these interventions for other clinical endpoints remains unclear. Buprenorphine and methadone appear to have superior retention to naltrexone based on a small number of studies. Upon comparison with methadone, the efficacy of slow-release oral morphine was not statistically different. However, the lack of comparison with other pharmacotherapeutic options and the potential presence of reporting bias may hinder accurate conclusions about the efficacy of SROM. Finally, our study revealed directions for future research, which include (1) further RCTs involving naltrexone or SROM to assess their relative efficacy in relation to buprenorphine and methadone, and (2) observational studies to examine long-term comparative effectiveness of these medications.
Supporting information
S1 Text. Analysis plan for using the CINeMA framework. (DOCX)
S1 Table. PRISMA NMA checklist of items to include when reporting a systematic review involving a network meta-analysis of randomized controlled trials.
Table. Deviance information criterion (DIC) for fixed effects model versus random effects model 0 and meta-regression analysis.
0 This was because the fixed effects model assumes that all studies share the same common effect, but it may not be reasonable to assume that there is one common effect size. On the other hand, the random effects model assumes that the observed estimates of the treatment effect can vary across studies due to systematic differences in the treatment effect as well as random variations due to chance [Riley, R.D., J.P. Higgins, and J.J. Deeks, Interpretation of random effects meta-analyses. BMJ, 2011. 342: p. d549.].
1 Comparison with the random effects model.
2 Categorization of risk of bias (RoB) was based on whether the overall RoB was 'Low risk' or 'Some or High risk' using the RoB 2.0 tool.
3